-
-
Notifications
You must be signed in to change notification settings - Fork 438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the MPI parcelport and change the zero-copy threshold to 8192 #6274
Conversation
Can one of the admins verify this patch? |
{ | ||
chunk_buffers_.resize(num_zero_copy_chunks); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like (partially) duplicated code from https://github.com/STEllAR-GROUP/hpx/pull/6274/files#diff-c637d9bc3526314c8c727eded642f51182eeaf3db73986599d742e0cbfd9564fR84. Could that be unified/simplified?
|
||
void run() noexcept | ||
{ | ||
util::mpi_environment::scoped_lock l; | ||
get_next_free_tag(l); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is acquiring this lock still necessary?
int tag = next_tag.fetch_add(1, std::memory_order_relaxed) % | ||
(util::mpi_environment::MPI_MAX_TAG - 1) + | ||
1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add an assert when tag overflows? ... I'm not sure.
1df2d4f
to
1351724
Compare
@hkaiser I think this PR is ready for you to take a look at again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@JiakunYan could you please rebase this PR to fix the issues reported (those were caused by problems on the master branch when you branched this off). |
…tion threshold from 128 to 8192 - Replace the lock-based tag provider with an atomic variable - Make the header message size dynamic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
bors merge |
Build succeeded! The publicly hosted instance of bors-ng is deprecated and will go away soon. If you want to self-host your own instance, instructions are here. If you want to switch to GitHub's built-in merge queue, visit their help page.
|
Improve the MPI parcelport:
Since the maximum size of the dynamic header message is set to be the zero-copy serialization threshold, I also changed it from 128B to 8192B. (The message size that MPI switches from eager protocol to rendezvous protocol is usually around 8K - 64K).
Experiments: Octo-Tiger, max_level=6, SDSC Expanse, 32 nodes, 128 threads per node.
With
hpx.parcel.mpi.sendimm=0
, this PR improves the total execution time from 11.77s to 10.68s.With
hpx.parcel.mpi.sendimm=1
, this PR improves the total execution time from ~60s to 11.72s.